ghc パターンマッチの時間計算量

土 07 4月 2018

reddit で見かけて, ふと気になったのでメモ. GCC で C/C++ コードのswitch文およびcase節をコンパイルするとき, case節の数が一定以上を超えると, ジャンプテーブルを利用したアセンブリが吐き出される1. 同様にして, ghc はパターンマッチでジャンプテーブルが用いられる場合がある.
以下, メーリングリスト2から, パターンマッチの時間計算量に関する言及について一部引用.

(snip) To answer you question, O(1) for simple patterns, but it really depends on the complexity of the pattern-matching expression and the Core-to-Core transformations that GHC applies. To truly understand the complexity, you need take a look at the Core/STG dump (I prefer STG since it’s simple). If you have any particular code samples you’d like me to help you analyze, I’d be happy to do so.
A basic example:
data Color = Red | Blue | Green
isRed Red = True
isRed _ = False
GHC will transform this to
isRed x = case x of { Red -True; DEFAULT -False }
You can think of a case as a switch expression in your standard imperative languages. A case expression will evaluate the thunk ‘x’ and perform a switch on the tag of the result. Each data constructor has an integer tag associated with it which will be the target of the switch. So the time complexity of `isRed` will be the time complexity of thunk evaluation which is impossible to predict because a thunk can be incredibly complex. Lazy evaluation is not so easy to analyze. It is highly context-sensitive.(snip)
The way you’re measuring algorithmic complexity does carry over to the lazy setting provided it’s single-threaded because the order of execution is purely determined by the STG Code. In the concurrent lazy setting, it’s a bit trickier since lightweight locking mechanisms occur when multiple threads evaluate the same thunk, making it non-deterministic.

関連URL: